Lambda-Policy Iteration: A Review and a New Implementation

Author

  • Dimitri P. Bertsekas
Abstract

In this paper we discuss λ-policy iteration, a method for exact and approximate dynamic programming. It is intermediate between the classical value iteration (VI) and policy iteration (PI) methods, and it is closely related to optimistic (also known as modified) PI, whereby each policy evaluation is done approximately, using a finite number of value iterations. We review the theory of the method and associated questions of bias and exploration arising in simulation-based cost function approximation. We then discuss various implementations, which offer advantages over well-established PI methods that use LSPE(λ), LSTD(λ), or TD(λ) for policy evaluation with cost function approximation. One of these implementations is based on a new simulation scheme, called geometric sampling, which uses multiple short trajectories rather than a single infinitely long trajectory.
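
To make the "intermediate between VI and PI" claim concrete, here is a minimal sketch of exact λ-policy iteration on a tiny discounted-cost problem. It is not the implementation discussed in the paper: it assumes a finite MDP with known transition probabilities, uses no cost function approximation or simulation, and all names (`P`, `g`, `alpha`, `lam`, the two-state example) are hypothetical. Each iteration picks a policy μ that is greedy with respect to the current cost vector J, then sets the next vector to the solution of J' = (1 − λ) T_μ J + λ T_μ J', found here by plain fixed-point iteration. With λ = 0 this reduces to VI, and with λ = 1 it reduces to PI.

```python
def q_value(P, g, alpha, J, i, u):
    # Expected one-stage cost plus discounted cost-to-go for action u in state i.
    return g[i][u] + alpha * sum(p * J[j] for j, p in enumerate(P[i][u]))


def greedy_policy(P, g, alpha, J):
    # Policy mu with T_mu J = T J (cost minimization).
    return [min(range(len(P[i])), key=lambda u: q_value(P, g, alpha, J, i, u))
            for i in range(len(P))]


def apply_T_mu(P, g, alpha, mu, J):
    # One application of the policy's Bellman operator T_mu to J.
    return [q_value(P, g, alpha, J, i, mu[i]) for i in range(len(P))]


def lambda_policy_iteration(P, g, alpha, lam, num_iters=50, inner_iters=200):
    J = [0.0] * len(P)
    mu = greedy_policy(P, g, alpha, J)
    for _ in range(num_iters):
        mu = greedy_policy(P, g, alpha, J)
        T_mu_J = apply_T_mu(P, g, alpha, mu, J)
        # Solve J_next = (1 - lam) * T_mu J + lam * T_mu J_next by fixed-point
        # iteration; this mapping is a contraction with modulus alpha * lam.
        J_next = list(J)
        for _ in range(inner_iters):
            T_mu_J_next = apply_T_mu(P, g, alpha, mu, J_next)
            J_next = [(1 - lam) * a + lam * b
                      for a, b in zip(T_mu_J, T_mu_J_next)]
        J = J_next
    return J, mu


# Hypothetical two-state, two-action example.
# P[i][u][j] is the probability of moving from state i to state j under action u,
# g[i][u] is the expected one-stage cost, alpha is the discount factor.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
g = [[1.0, 2.0],
     [0.0, 0.5]]
alpha = 0.9

if __name__ == "__main__":
    for lam in (0.0, 0.5, 1.0):
        J, mu = lambda_policy_iteration(P, g, alpha, lam)
        print(lam, J, mu)
```

In this example every choice of λ should reach the same optimal costs and policy; what changes is how much of the policy evaluation work is done per iteration. The geometric sampling scheme mentioned in the abstract is a simulation-based alternative to the inner loop and is not shown here.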


Similar resources

Sather Iters: Object-oriented Iteration Abstraction

Sather iters are a powerful new way to encapsulate iteration. We argue that such iteration abstractions belong in a class' interface on an equal footing with its routines. Sather iters were derived from CLU iterators but are much more flexible and better suited for object-oriented programming. We motivate and describe the construct along with several simple examples. We compare it with iteration ...


Offset Policy: An Advanced Countertrade Practice

This paper explains offset and develops a strategic approach for the implementation of offset-policy for a buyer country. Offset emerges when a country cannot afford to pay cash for non-essential imports, and cannot get cash for many of its products. Offset arrangements are most frequently found in the defense-related sector. However, recently, it refers to a range of industrial and commercial ...


“Horses for Courses”; Comment on “Translating Evidence Into Healthcare Policy and Practice: Single Versus Multi-Faceted Implementation Strategies – Is There a Simple Answer to a Complex Question?”

This commentary considers the vexed question of whether or not we should be spending time and resources on using multifaceted interventions to undertake implementation of evidence in healthcare. A review of systematic reviews has suggested that simple interventions may be just as effective as those taking a multifaceted approach. Taking cognisance of the Promoting Action on Research Implementat...


Perceptions of Community Involvement in the Peruvian Mental Health Reform Process Among Clinicians and Policy-Makers: A Qualitative Study

Background The global burden of mental health conditions has led to the implementation of new models of care for persons with mental illness. Recent mental health reforms in Peru include the implementation of a community mental health model (CMHM) that, among its core objectives, aims to provide care in the community through specialized facilities, the community mental health centers (CMH...



Journal title:
  • CoRR

Volume abs/1507.01029  Issue 

Pages  -

Publication date 2011